fix(data): prompt-length filtering crashes on VLM dataset with apply_chat_template by Meihan-chen · Pull Request #2126 · THUDM/slime

Meihan-chen · 2026-06-23T09:48:14Z

Repro

Prompt-length filtering crashes for multimodal (VLM) datasets when --apply-chat-template is set, making the feature unusable. Reproduced end-to-end on the geo3k VLM example with --apply-chat-template + --rollout-max-prompt-len:

slime/utils/data.py:111, in filter_long_prompt
    multimodal_inputs = process_vision_info(sample.prompt, processor)
TypeError: string indices must be integers, not 'str'

Root cause

In filter_long_prompt (slime/utils/data.py), the multimodal branch re-derived vision info from sample.prompt:

multimodal_inputs = process_vision_info(sample.prompt, processor)
processor_output = processor(text=sample.prompt, **multimodal_inputs)

With apply_chat_template=True, Sample.prompt is the rendered string, but filter_long_prompt passed it to process_vision_info, which expects a conversation list → crash. The vision inputs are already computed and stored in Sample.multimodal_inputs, so this recomputation is both wrong and redundant.

Why it's easy to hit

Setting --rollout-max-context-len (which derives rollout_max_prompt_len), --rollout-max-prompt-len, or --eval-max-prompt-len activates the filter on a VLM dataset and trips the crash.

Fix

Reuse the multimodal inputs already stored on the sample, routed through the same build_processor_kwargs helper the rollout path (sglang_rollout) uses, so the token length measured during filtering matches the real pipeline:

processor_kwargs = build_processor_kwargs(sample.multimodal_inputs)
processor_output = processor(text=sample.prompt, **processor_kwargs)

filter_long_prompt re-extracted vision info from sample.prompt via process_vision_info in the multimodal branch. When apply_chat_template is set, sample.prompt is the rendered *string* (not a conversation list), so process_vision_info -> qwen_vl_utils crashed with "TypeError: string indices must be integers, not 'str'". This made prompt-length filtering unusable for any VLM dataset: setting --rollout-max-context-len (which derives rollout_max_prompt_len) or --rollout-max-prompt-len / --eval-max-prompt-len activates the filter and hits the crash. Reuse the multimodal inputs already computed during dataset construction via build_processor_kwargs (matching the sglang_rollout path) instead of recomputing them from the string prompt. Add CPU unit tests covering the multimodal branch and a mixed text-only + multimodal dataset. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Signed-off-by: Meihan-chen <zr010426ztt@outlook.com>

Signed-off-by: Meihan-chen <zr010426ztt@outlook.com>

Meihan-chen changed the title ~~fix(data): reuse stored multimodal_inputs in filter_long_prompt to fix VLM length-filtering crash~~ fix(data): prompt-length filtering crashes on any VLM dataset with apply_chat_template Jun 23, 2026

Meihan-chen force-pushed the fix/multimodal-length-filter branch from 7447cd0 to a4560f6 Compare June 23, 2026 10:00

Meihan-chen changed the title ~~fix(data): prompt-length filtering crashes on any VLM dataset with apply_chat_template~~ fix(data): prompt-length filtering crashes on VLM dataset with apply_chat_template Jun 23, 2026

del test

0e0ec78

Signed-off-by: Meihan-chen <zr010426ztt@outlook.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(data): prompt-length filtering crashes on VLM dataset with apply_chat_template#2126

fix(data): prompt-length filtering crashes on VLM dataset with apply_chat_template#2126
Meihan-chen wants to merge 2 commits into
THUDM:mainfrom
Meihan-chen:fix/multimodal-length-filter

Meihan-chen commented Jun 23, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

Meihan-chen commented Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Repro

Root cause

Why it's easy to hit

Fix

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Meihan-chen commented Jun 23, 2026 •

edited

Loading